Abstract: Common misconceptions about the Heisenberg principle are reviewed, and the original spirit of
the principle is reestablished in terms of the trade-off between the information retrieved by a
measurement and the disturbance on the measured system. After analyzing the possibility of
probabilistically reversible measurements, along with erasure of information and undoing of disturbance,
general information-disturbance trade-offs are presented, where the disturbance of the measurement is
related to the possibility in principle of undoing its effect.

The need for extreme miniaturization and the recent discovery of radically new kinds of information
processing [?] have dramatically changed our attitude towards quantum mechanics, which has finally left
the middle age of purely academic consideration to become a relevant chapter of modern information
technology. At the beginning, "quantum" was synonymous with "uncertainty", and was regarded merely as a
major limitation in nanotechnology. More recently, however, we have learned how to turn the "quantum"
into a powerful horse that we can harness and ride, with unimagined possibilities in principle for
guaranteed cryptographic communications and a tremendous speedup of complex computational tasks, giving
birth to the new quantum information technology.

In theoretical research on quantum information, one of the main programs is undoubtedly to
establish the actual limitations and controllability of quantum measurements, in a unified framework
suited to the needs of optimization and engineering. To non-expert eyes, however, this program may
appear quite incompatible with the very paradigm of quantum mechanics: the so-called
"Heisenberg principle", which establishes the "participatory" nature of the quantum experiment. In
fact, according to its popular version (based on the gedanken experiment of the $\gamma$-ray microscope [?],
which was then elevated to "principle" by Ruark [?]) it is impossible to measure one variable, say the
momentum $p$, of a conjugated pair (e.g. position $q$ and momentum $p$) without "disturbing" the value
of the conjugated variable $q$ by an amount $\Delta q$ no less than the order of $h/\Delta p$, where
$\Delta p$ is the accuracy of the measurement. Such a paradigm is not just folklore for the layman, since
the principle is clearly stated and emphasized in excellent textbooks of quantum mechanics, e.g. the
valuable Messiah book, which devotes a lengthy section to the "uncontrollable disturbance during the
operation of measurement", with an extensive analysis of different thought experiments in support of the
generality of the "principle", and concludes that "the unpredictable and uncontrollable disturbance
suffered by the physical system during a measurement is always sufficiently strong that the uncertainty
relations always hold true." Misunderstandings and misuses even at the level of advanced research are
revealed, for example, by the controversy [?] on the existence of a "standard quantum limit" for the
precision in monitoring the position of a free mass, a problem which arose in the field of gravitational
wave detection. Finally, the controversial nature of the Heisenberg principle is also witnessed by the
existence of an entire book on quantum measurements [?] based on the use of the principle beyond its
original heuristic nature, in contrast to some "classics" of quantum mechanics that do not even mention
it, e.g. the Landau and Lifshitz book [?]. Likewise, if you look for "uncertainty principle" in the
subject index of the Peres book [18], the page number referred to is provocatively the page of the index
entry itself.

Before proceeding with the discussion of the Heisenberg principle, let me first clarify a common
confusion between "uncertainty relations" and "uncertainty principle": the former concern the statistics
of repeated measurements on an ensemble of equally prepared identical quantum systems, whereas the
latter concerns a sequence of measurements on the same quantum system (this difference is well
emphasized in the Jammer book [19]). The "uncertainty relations" do not have any bearing on
the issue of the measurement disturbance, since they can be experimentally tested by measuring each of
the observables separately: at most one of the two root mean squares, say $\Delta p$, can be considered
as the precision of the preparation, e.g. by a collimator of particle momentum, and then $\Delta q$ will
result from the statistics of measuring only $q$. In other words, both $\Delta p$ and $\Delta q$ are a
priori uncertainties according to the Born rule, and neither results as a consequence of the disturbance
due to the measurement. As a matter of fact, since both $\Delta p$ and $\Delta q$ are intrinsic to the
wave function before the measurement, they cannot be logically connected to the interaction with the
apparatus. Indeed, a measurement model was provocatively proposed by Ozawa [?] in which the position of
the particle can be measured leaving it in an eigenstate of the momentum. With no proper distinction
between preparation and measurement (this issue is extensively analyzed in the recent paper by Muynck [?])
the two forms of complementarity were amalgamated, leading to another erroneous interpretation of the
Heisenberg principle as related to joint measurements (see for example the Bohm book). In the quantum
mechanics of Dirac and von Neumann, joint measurements are allowed only for compatible observables, in
full logical contradiction with the latter interpretation. Nevertheless, there are precise indirect
models [23, 24] describing approximate joint measurements (which are actually achieved in a heterodyne
apparatus [?]), and the resulting minimum uncertainty product is in principle twice the Heisenberg
bound [?], the so-called 3 dB of added noise in the optimal joint measurement.

There is an extensive literature on the various misinterpretations of the principle, starting from its
very origins. Bohr himself disagreed with Heisenberg on the gedanken experiment of the $\gamma$-ray
microscope, as quoted in the original paper [?]. Lamb [?] criticized the $\gamma$-ray microscope as
unsuitable for position measurements. Historical reviews can be found, for example, in the Jammer
books [?], [19], and in the Beller book. A serious criticism of the use of the classical definition of
resolving power due to diffraction in the gedanken experiment is made in Ref. [?], where Heisenberg
microscopes with super-resolution violating the principle are devised. Criticisms of the use of root
mean squares as measures of uncertainty and disturbance are made in various papers (see, for example,
Ref. [31]). As regards the "uncertainty relations", there have been many alternative derivations and
generalizations since the origins (see Ref. [?] for a detailed history). The general formulation for any
pair of noncommuting observables is due to Robertson, after some relevant remarks of Condon [?].
Schroedinger [?] then recognized that the uncertainty product is not invariant under unitary
transformations, and found "tighter" uncertainty relations. More recently, "entropic" generalizations of
the uncertainty relations were given [?, 36], noise-dependent relations in Ref. [?], and higher-order
uncertainty relations also involving more than two operators [?], to quote only some work known to the
present author.

Coming back to the original problem of the Heisenberg gedanken experiment: even though it is clear
that the "uncertainty relations" do not have any bearing on the issue of the measurement disturbance,
and that there is no in-principle "uncontrollable disturbance during the operation of measurement",
the issue of the minimum in-principle disturbance from a quantum measurement in relation to the
information gained from the measurement is still an unsolved problem. That a kind of Heisenberg principle
must exist in the form of an information-disturbance trade-off is evident, for example, from the
impossibility of determining the wave function of a single system from any sequence of measurements on
the same quantum system [?]. Such a possibility has recently intrigued several authors [?, 42, 43], [?],
who explored concrete measurement schemes based on vanishingly weak quantum nondemolition measurements,
weak measurements on "protected" states, and "logically reversible" and "physically reversible" [?]
measurements. In each of these schemes the conclusion is that it is practically impossible to
measure the wave function of a single system, either because the weakness of the measuring interaction
prevents one from gaining information on the wave function, or because the method of protecting the
state actually requires some a priori knowledge of the state (this is suggested in Refs. [?]),
or because quantum measurements can be physically reverted only with vanishingly small probability of
success. The impossibility of determining the wave function of a single quantum system is dictated
by the no-cloning theorem [?], which is just a direct consequence of the unitarity of quantum mechanics
[?]. Therefore, as a consequence of the general laws of quantum mechanics, there must be a detailed
balance between information and disturbance, which makes it impossible to determine the state of a single
quantum system from any sequence of measurements on it.

Despite the relevance of the problem of the information-disturbance trade-off at the foundational
level (even though it is a consequence of the quantum laws), very little literature can be found on this
issue, maybe due to the difficulty of the problem. The issue has also recently become of practical
relevance for setting general limits on eavesdropping in quantum cryptographic communications. For this
purpose, for example, in Ref. [47] Fuchs and Peres analyzed some trade-offs for two-state
discrimination. Apart from this work, only a few studies are known to the present author: the very
interesting analyses by Fuchs [?] and by Barnum [?], and, only very recently, a definite result by
Banaszek on a general trade-off between the quality of a single state estimation and the fidelity between
the input and the output states of the measurement. Ozawa has also recently proposed a general trade-off,
which will be discussed in more detail in the following. Finally, Belavkin [?] has given a Heisenberg
principle for continuous measurement of the position in the framework of filtering theory.

In this paper some results will be presented in the attempt to give a general information-disturbance
trade-off which holds for any quantum measurement. The trade-off must be valid "in principle", hence at
the single-outcome level, and not only on average over outcomes, as for those considered in Refs. [?].

Also, since it should be valid in a general context, the trade-off has to be independent of the particular
analytical form of information and disturbance, which is suited to the particular problem at hand (in
the analyses of Refs. [?] the fidelity between input and output has been considered as a measure of the
"disturbance"). This requirement of generality has led us to consider trade-offs in the form of
majorization "orderings" [?] between the conditional probability from the measurement and quantities
related to the measurement effect on the input state, the former being the variables from which one can
evaluate any kind of "information", the latter being the source of the "disturbance". The disturbance of
the measurement will be related to the possibility in principle of undoing its effect, and for this
reason we will first analyze in general the occurrence of probabilistically reversible measurements. We
will see that when the measurement effect is undone, the information retrieved from it is also erased,
and from this we will argue that in a cascade of measurements the disturbance can also be decreased,
however at the expense of losing the previously gained information. The case of measuring an "observable"
will be analyzed in some detail. The majorization trade-off will then be applied to the common case of
the mutual information retrieved from the measurement: this will lead us to a trade-off in the form of a
bound tighter than the Holevo bound, with the disturbance in the form of a Shannon entropy as a function
of the singular values of the measurement "contraction" (the operator describing the effect of a single
outcome of the measurement). As we will see, the generality of the majorization relation turns out to be
a weakness when a specific case of information/disturbance is considered, since it proves the validity of
the trade-off in a more limited situation than the actual one, depending on the relation between the
measurement and the ensemble of input states. Finally, we will see that the disturbance obtained in this
way agrees with the "decrease of entanglement" due to the measurement when it acts locally on an
entangled state.



2 Information-disturbance trade-offs 

Since we are looking for an in-principle trade-off which should account for the impossibility of
determining the state of a single quantum system without a priori knowledge, we need to consider the
general measurement scenario, in which a sequence of measurements is performed on a single quantum
system, with the possibility of changing the measuring apparatus at each step, e.g. depending on the
outcome of the previous step. Therefore, our information-disturbance trade-off must be valid at the
single-outcome level, not just on average over outcomes. Moreover, to be true "in principle", we must
consider a situation of perfect control of the measurement, namely the apparatus is perfectly known, and
we are able to perform any measurement and any unitary transformation at will, according to the rules
of quantum mechanics. In the following we will refer to such an in-principle situation as perfect
technology.

Notation 

Throughout this paper, we will use boldfaced letters and square brackets to denote arrays/vectors,
e.g. $\mathbf{x} = [x_i] = (x_1, x_2, \ldots)$. For any operator $A$ on the Hilbert space $\mathsf{H}$
with $d = \dim(\mathsf{H})$, by $\mathrm{Ker}(A)$ we will denote the kernel of $A$, by $\mathrm{Rng}(A)$
its range, by $r(A)$ its rank, and by $P_A$ the orthogonal projector on $\mathrm{Rng}(A)$. We will write
the singular value decomposition of $A$ as $A = X_A \Sigma_A Y_A$, where
$\Sigma_A = \mathrm{diag}\{\sigma_1(A), \sigma_2(A), \ldots, \sigma_r(A), 0, \ldots, 0\}$ is the diagonal
matrix of singular values of $A$ ordered decreasingly (including also the vanishing ones), and $X_A$ and
$Y_A$ are unitary operators of left and right eigenvectors respectively. By
$\|A\|_p = [\sum_i \sigma_i(A)^p]^{1/p}$ we will denote the $p$-Schatten norm of $A$, with $\|A\|_1$ the
trace norm, $\|A\|_2$ the Hilbert-Schmidt norm, and with $\|A\| = \|A\|_\infty$ the usual operator norm.
The symbol $A^{\ddagger}$ will denote the Moore-Penrose pseudoinverse of $A$, i.e.
$A^{\ddagger} = Y_A^{\dagger} \Sigma_A^{\ddagger} X_A^{\dagger}$, with
$\Sigma_A^{\ddagger} = \mathrm{diag}\{\sigma_1^{-1}(A), \sigma_2^{-1}(A), \ldots, \sigma_r^{-1}(A), 0, \ldots, 0\}$,
i.e. $A^{\ddagger}$ is the same as $A^{\dagger}$ but with the inverse of the non-vanishing singular
values. The Moore-Penrose pseudoinverse is completely characterized by the properties
$A A^{\ddagger} A = A$, $A^{\ddagger} A A^{\ddagger} = A^{\ddagger}$,
$(A^{\ddagger} A)^{\dagger} = A^{\ddagger} A$, and $(A A^{\ddagger})^{\dagger} = A A^{\ddagger}$. It follows
that $P_A = A A^{\ddagger}$ and $P_{A^{\dagger}} = A^{\ddagger} A$.

We will denote by $\mathcal{E} = (S, \mathbf{a})$ the ensemble of states $S = \{\psi\}$ distributed with a
priori probability $\mathbf{a} = [a(\psi)]$, using the abbreviated notation $\psi \in \mathcal{E}$ for
$\psi \in S(\mathcal{E})$, with $S(\mathcal{E})$ and $\mathbf{a}(\mathcal{E})$ denoting the set of states
and the probability distribution of the ensemble $\mathcal{E}$, respectively, and $|\mathcal{E}|$ the
cardinality of $S(\mathcal{E})$. The singleton set with the state $\psi$ will be denoted by the state
itself, $\psi$. We will call universal ensemble the uniform ensemble of all possible (pure) input states.
With $\rho_{\mathcal{E}} = \sum_{\psi \in \mathcal{E}} a(\psi) |\psi\rangle\langle\psi|$ we will denote
the a priori density operator of the ensemble $\mathcal{E}$. The Shannon entropy of the probability vector
$\mathbf{a} = [a_i]$ will be denoted by $H(\mathbf{a}) = -\sum_i a_i \log a_i$, and for the ensemble
$\mathcal{E}$ we will also write equivalently
$H(\mathcal{E}) = H(\mathbf{a}(\mathcal{E})) = -\sum_{\psi \in \mathcal{E}} a(\psi) \log a(\psi)$. Finally,
we will write $\mathcal{E} = p\mathcal{E}_1 + (1-p)\mathcal{E}_2$ for the union ensemble with
$S(\mathcal{E}) = S(\mathcal{E}_1) \cup S(\mathcal{E}_2)$ in which a state is picked from
$S(\mathcal{E}_1)$ or $S(\mathcal{E}_2)$ with probability $p$ and $1-p$, respectively, corresponding to
the density operator $\rho_{\mathcal{E}} = p\rho_{\mathcal{E}_1} + (1-p)\rho_{\mathcal{E}_2}$, and write
$\mathcal{E} = p\mathcal{E}_1 \oplus (1-p)\mathcal{E}_2$ when $S(\mathcal{E}_1) \perp S(\mathcal{E}_2)$.

2.1 Pure measurements

A measurement with perfect technology means that we have a precise quantum description of the
apparatus. Such a measurement is pure, namely it preserves the purity of states. A pure measurement for
a single outcome is described by a contraction $M$, namely an operator with bounded norm $\|M\| \le 1$,
which guarantees an occurrence probability not greater than unity for any input state. The output state
$|\psi_M\rangle$ after the measurement and the probability $p(M|\psi)$ that $M$ occurs on the input state
$|\psi\rangle$ are given by

$$|\psi_M\rangle = \frac{M|\psi\rangle}{\|M\psi\|} \ \text{(state reduction)}, \qquad p(M|\psi) = \|M\psi\|^2 \ \text{(Born rule)}. \qquad (1)$$

We will also regard the case of unitary $M$ as a limiting case of "measurement", which gives no
information on $|\psi\rangle$, since $p(M|\psi) = 1$ independently of $|\psi\rangle$. This also
corresponds to no in-principle disturbance for any state, since with perfect technology we can
deterministically reverse the effect of $M$ without knowing $|\psi\rangle$.

2.2 Information from a single measurement outcome 

We can always regard the quantum measurement as a problem of discriminating between a set of
hypotheses corresponding to an ensemble $\mathcal{E} = (S, \mathbf{a})$ of states $S = \{\psi\}$
distributed with a priori probability $\mathbf{a} = [a(\psi)]$. The Shannon entropy $H(\mathcal{E})$
quantifies our a priori "ignorance" of which state of the ensemble we are given. When the outcome
corresponding to the contraction $M$ has occurred, our ignorance is reduced, since now the a priori
probability distribution $\mathbf{a} = [a(\psi)]$ is updated to the a posteriori probability
$\mathbf{a}_M = [a(\psi|M)]$ that the state was $\psi$ given that we know that $M$ has occurred [the
corresponding ensemble will be denoted by $\mathcal{E}_M = (S, \mathbf{a}_M)$]. The probability
$a(\psi|M)$ is given by the Bayes rule $a(\psi|M) = a(\psi)\, p(M|\psi) / p_{\mathcal{E}}(M)$, where
$p_{\mathcal{E}}(M) = \mathrm{Tr}[\rho_{\mathcal{E}} M^{\dagger} M]$ denotes the overall occurrence
probability of $M$. The information $\Delta I_{\mathcal{E}}(M)$ on which state $\psi \in \mathcal{E}$
gained from the occurrence of $M$ is just the difference between our ignorance before and after the
occurrence of $M$, namely

$$\Delta I_{\mathcal{E}}(M) = H(\mathcal{E}) - H(\mathcal{E}_M) = -\sum_{\psi \in \mathcal{E}} a(\psi) \log a(\psi) + \sum_{\psi \in \mathcal{E}} a(\psi|M) \log a(\psi|M). \qquad (2)$$
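
As a minimal numerical sketch of Eq. (2), consider a toy ensemble of three hypotheses. The prior, the likelihoods $p(M|\psi)$, and the function names below are assumptions chosen for illustration only; logarithms are taken in base 2. The code applies the Bayes rule and evaluates the single-outcome information, once for a uniform prior and once for a peaked prior that the outcome contradicts.

```python
import math

def shannon_entropy(probs):
    """H(a) = -sum_i a_i log2 a_i, with the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def posterior(prior, likelihood):
    """Bayes rule: a(psi|M) = a(psi) p(M|psi) / p_E(M)."""
    joint = [a * p for a, p in zip(prior, likelihood)]
    p_outcome = sum(joint)  # overall occurrence probability p_E(M)
    return [j / p_outcome for j in joint]

def information_gain(prior, likelihood):
    """Single-outcome information of Eq. (2): H(E) - H(E_M), in bits."""
    return shannon_entropy(prior) - shannon_entropy(posterior(prior, likelihood))

# Assumed numbers: uniform prior, outcome favouring the first hypothesis.
gain_uniform = information_gain([1 / 3, 1 / 3, 1 / 3], [0.9, 0.3, 0.3])
# Assumed numbers: peaked prior contradicted by the outcome, so the
# posterior is flatter than the prior.
gain_peaked = information_gain([0.9, 0.05, 0.05], [0.1, 0.9, 0.9])
print(gain_uniform, gain_peaked)
```

With these illustrative numbers the first gain is positive and the second is negative, which is the single-outcome behaviour discussed in subsection 2.4.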

2.3 Knowingly reversible measurements 

We say that the effect of a measurement outcome corresponding to the contraction $M$ is knowingly
reversible on a set $S = \{\psi\}$ of input states if for any a priori unknown input state $\psi \in S$
we can perform another measurement on the output state $\psi_M$ of $M$ such that for some outcome, say
the one corresponding to the contraction $\tilde{M}$, we know for sure that the new output state is the
original $\psi$, for all $\psi \in S$. In other words, the contraction $M$ is knowingly reversible on $S$
if there is another contraction $\tilde{M}$ such that

$$\tilde{M} M |\psi\rangle \propto |\psi\rangle, \qquad \forall \psi \in S. \qquad (3)$$

This means that with some probability we can undo the effect of $M$ with another measurement contraction
$\tilde{M}$. The squared modulus of the proportionality constant in Eq. (3) is the overall probability of
achieving $M$ and knowingly reversing it with $\tilde{M}$. If $r(M) = d$ ($M$ full rank), then $M$ is
knowingly reversible for any input state, since it is invertible as an operator. It is easy to check
that, apart from an overall phase factor, the most efficient reversion $\tilde{M}$ (i.e. the one
maximizing the reversing probability on any input state) is given by
$\tilde{M} = M^{-1}/\|M^{-1}\|$. In fact, by taking $\tilde{M} = \omega M^{-1}$, the probability of
achieving $\tilde{M}$ on $|\psi_M\rangle$ multiplied by the probability $p(M|\psi)$ of achieving $M$ on
$|\psi\rangle$ is just $|\omega|^2$, and the maximum $|\omega|$ for which $\tilde{M}$ is a contraction is
$|\omega| = \|M^{-1}\|^{-1}$. For the most efficient reversion $\tilde{M}$ the probability
$p_{rev}$ of reversion is bounded as $\kappa^{-2}(M) \le p_{rev} \le 1$, with
$\kappa(M) = \|M\| \|M^{-1}\|$ the condition number of $M$, and the bounds are achieved by the singular
vectors of $M$ corresponding to $\sigma_1(M)$ and $\sigma_d(M)$, respectively. We see that the smaller
the condition number $\kappa(M)$ of $M$, the higher the chance of reversing $M$, i.e. the "more
reversible" $M$ is. Since the condition number of an operator also gives an error estimate under small
perturbations of the linear action of the operator [?], this means that the less reversible $M$ is, the
more "amplified" an input perturbation will be at the output. Also, notice that the probability
$p(\tilde{M} M|\psi)$ of the cascade of $M$ and its successful reversion is
$p(\tilde{M} M|\psi) = |\omega|^2$, independently of the input state $|\psi\rangle$, and for the most
efficient reversion it is
$p(\tilde{M} M) = \sigma_d^2(M) \le [\prod_i \sigma_i^2(M)]^{1/d} \le \frac{1}{d}\|M\|_2^2$. The bound
$[\prod_i \sigma_i^2(M)]^{1/d}$ generalizes the Bhattacharyya overlap given in Ref. [55] for the case in
which the measurement corresponds to an observable $X$ (see subsection 2.5).
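
The bounds on the reversion probability can be checked numerically. The following sketch assumes, for simplicity, that $M$ is diagonal in the basis used (so that its diagonal entries are its singular values) and that the input states have real amplitudes; the singular values, the test states, and the function name are illustrative assumptions. It evaluates $p(M|\psi)$, the probability $p_{rev}$ of the most efficient reversion $\tilde{M} = M^{-1}/\|M^{-1}\|$ on the reduced state, and their product.

```python
import math

def reversal_probabilities(sigmas, psi):
    """For M = diag(sigmas), full rank, and a normalized real state psi,
    return (p(M|psi), p_rev, cascade probability) for the reversion
    M~ = omega * M^{-1} with omega = 1/||M^{-1}|| = min(sigmas)."""
    omega = min(sigmas)
    p_m = sum((s * c) ** 2 for s, c in zip(sigmas, psi))          # Born rule, Eq. (1)
    out = [s * c / math.sqrt(p_m) for s, c in zip(sigmas, psi)]   # state reduction
    p_rev = sum((omega * c / s) ** 2 for s, c in zip(sigmas, out))
    return p_m, p_rev, p_m * p_rev

sigmas = [0.9, 0.6, 0.3]             # assumed singular values, all nonzero
kappa = max(sigmas) / min(sigmas)    # condition number of M
states = [
    [1.0, 0.0, 0.0],                 # singular vector of sigma_1
    [0.0, 0.0, 1.0],                 # singular vector of sigma_d
    [1 / math.sqrt(3)] * 3,          # equal superposition
]
results = [reversal_probabilities(sigmas, psi) for psi in states]
for p_m, p_rev, cascade in results:
    print(round(p_m, 4), round(p_rev, 4), round(cascade, 4))
```

In this sketch the cascade probability equals $\sigma_d^2(M)$ for every input state, while $p_{rev}$ ranges from $\kappa^{-2}(M)$ on the first state to 1 on the second.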

When $M$ is not full rank, i.e. $r(M) < d$, it is still possible to have situations in which $M$ is
knowingly reversible. The first case is when the set $S$ is orthogonally split by $M$, namely it can be
written as the union of two orthogonal subsets $S = S_{\ker} \cup S_{\mathrm{rng}}$, of which
$S_{\ker} \subseteq \mathrm{Ker}(M)$ and
$S_{\mathrm{rng}} \subseteq \mathrm{Ker}(M)^{\perp} = \mathrm{Rng}(M^{\dagger})$.
In fact, in this case we know a priori that $M$ cannot occur on an input state
$|\psi\rangle \in \mathrm{Ker}(M)$, whereas if $M$ occurred, then
$|\psi\rangle \in \mathrm{Rng}(M^{\dagger})$, and we can reverse $M$ with some probability using a
contraction $\tilde{M}$ such that $\tilde{M} M \propto P_{M^{\dagger}}$, namely

$$\tilde{M} = \omega M^{\ddagger} + Z(I - P_M), \qquad (4)$$

where $Z$ is any complex operator. Since $\tilde{M}$ must itself be a contraction, from
$\|\tilde{M}\| = \max\{|\omega| \|M^{\ddagger}\|, \|Z(I - P_M)\|\}$ we obtain the general
parametrization of the most efficient $\tilde{M}$ (apart from a phase factor)

$$\tilde{M} = \frac{M^{\ddagger}}{\|M^{\ddagger}\|} + Z(I - P_M), \qquad (5)$$

with $Z(I - P_M)$ a contraction.

As regards the case in which the set $S$ is not orthogonally split by $M$, the contraction can be
knowingly reversible only in the degenerate situation in which $S$ is the disjoint union
$S = S_{\ker} \cup \varphi$ of $S_{\ker} \subseteq \mathrm{Ker}(M)$ with a single state
$\varphi \notin \mathrm{Ker}(M)$. Since this case is not very interesting (it is essentially equivalent
to reversing $M$ on a single state only), we will not consider it in the following.



2.4 Negative informations: undoing a measurement erases its information 

In Ref. [?] Royer found an example of a knowingly reversible measurement on a two-dimensional space,
and supposed that a sequence of successfully reverted measurements could be used to determine the state
of a single quantum system with some probability, without any a priori knowledge of the state. However,
later in Ref. [?] he admitted that in fact this was not true. From Eq. (4) we can easily see that in
the most general case in which we are able to revert a contraction $M$, the probability of achieving $M$
and then reverting it is given by $|\omega|^2$, independently of the input state. Hence any succession of
successfully reverted measurements provides only the information that the input state was in
$\mathrm{Rng}(M^{\dagger})$, e.g. for an ensemble $\mathcal{E}$ orthogonally split by $M$ as
$\mathcal{E} = p\,\mathcal{E}_{\mathrm{rng}} \oplus (1-p)\,\mathcal{E}_{\ker}$ such information would be

$$\Delta I_{\mathcal{E}}(\tilde{M} M) = H(\mathcal{E}) - H(\mathcal{E}_{\mathrm{rng}}). \qquad (6)$$

For uniform $\mathcal{E}_{\mathrm{rng}}$, Eq. (6) gives
$\Delta I_{\mathcal{E}}(\tilde{M} M) = H(\mathcal{E}) - \log|\mathcal{E}_{\mathrm{rng}}|$, and for uniform
$\mathcal{E}$ one has
$\Delta I_{\mathcal{E}}(\tilde{M} M) = -\log p = \log(|\mathcal{E}|/|\mathcal{E}_{\mathrm{rng}}|)$. For the
input universal ensemble, $M$ is necessarily reversible only if $\mathrm{Rng}(M^{\dagger}) = \mathsf{H}$,
and the information (6) is then exactly zero. Since the occurrence of $M$ must have given some information
on which state of $\mathcal{E}$ anyway, this means that undoing the measurement must also erase the
information from it. In fact, the information from a single measurement outcome in Eq. (2) can be
negative: the reader unfamiliar with negative information should notice that the information quantities
considered in the literature are never negative, since they are averaged over all outcomes, whereas in
general the contribution from a single outcome can be negative. What does it mean to have negative
information? From Eq. (2) we see that negative information occurs when the a posteriori probability
distribution $\mathbf{a}_M = [a(\psi|M)]$ is less "peaked" around some $\psi \in S$ than the a priori
probability $\mathbf{a} = [a(\psi)]$. In practice, this means that the measurement result contradicts our
previous knowledge (see the amusing example by Uffink quoted in the Peres book [18]). And in fact, the
information $\Delta I_{\mathcal{E}_M}(\tilde{M})$ from the reversion $\tilde{M}$ (now with a priori
probability given by the posterior probability $\mathbf{a}_M$ from the previous measurement $M$) is
negative, and cancels exactly the previous information $\Delta I_{\mathcal{E}}(M)$. However, it is not
always possible to erase the information from a measurement with another one, and in common situations
the information is permanent, i.e. it cannot be erased, as in the case of a customary von Neumann
measurement.

From the above considerations we learn a general lesson: 1) in some cases the "disturbance" of two
measurement outcomes in cascade can be lower than that of a single measurement outcome, since, at the
very least, there are cases in which we can revert the measurement (i.e. with no overall disturbance),
and hence, more generally, we can partially undo the disturbance from a previous measurement; 2) when
some disturbance is undone, then necessarily some information is lost.
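
The exact cancellation between the information from $M$ and that from its reversion can be illustrated numerically. The sketch below assumes a full-rank contraction $M$ that is diagonal in the basis used, three non-orthogonal input states with real amplitudes, and an illustrative prior; all numbers and helper names are assumptions made for this example. The Bayes rule is applied twice, first for the outcome $M$ and then for the successful reversion $\tilde{M} = M^{-1}/\|M^{-1}\|$ acting on the reduced states.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, with 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def bayes(prior, likelihood):
    """Posterior probabilities from prior and outcome likelihoods."""
    joint = [a * p for a, p in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]

def occurrence(diag, psi):
    """Born rule for a diagonal contraction: ||M psi||^2."""
    return sum((m * c) ** 2 for m, c in zip(diag, psi))

def reduce_state(diag, psi):
    """State reduction: M psi / ||M psi||."""
    norm = math.sqrt(occurrence(diag, psi))
    return [m * c / norm for m, c in zip(diag, psi)]

m_diag = [0.9, 0.6, 0.3]                          # assumed singular values of M
rev_diag = [min(m_diag) / s for s in m_diag]      # M~ = M^{-1} / ||M^{-1}||
states = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.6, 0.8]]
prior = [0.5, 0.3, 0.2]

post_m = bayes(prior, [occurrence(m_diag, s) for s in states])
info_m = entropy(prior) - entropy(post_m)

reduced = [reduce_state(m_diag, s) for s in states]
post_rev = bayes(post_m, [occurrence(rev_diag, s) for s in reduced])
info_rev = entropy(post_m) - entropy(post_rev)

print(info_m, info_rev, post_rev)
```

For these assumed numbers the first outcome yields positive information, the reversion yields the opposite amount, and the final posterior coincides with the original prior.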



2.5 The case of measuring an observable 

When is a quantum measurement the measurement of an observable? This is the case in which the
positive operator valued measure (POVM) of the measurement is commutative, namely the POVM is
jointly diagonalized on the same orthonormal basis, say $\{|x\rangle\}$. In fact, let us denote by
$\{P_y\}$, with $P_y \ge 0$ and $\sum_y P_y = I$, the POVM of the measurement. We can conveniently write
the joint diagonalization as follows

$$P_y |x\rangle = p(y|x) |x\rangle, \qquad (7)$$

where the eigenvalue $p(y|x)$ of $P_y$ on the eigenvector $|x\rangle$ is denoted as a conditional
probability, since we must have $p(y|x) \ge 0$ and $\sum_y p(y|x) = 1$. In fact, we can interpret the
eigenvalue $p(y|x)$ as the conditional probability of getting $y$ when the "true" value was $x$ instead.
It is clear that the measurement of an observable corresponds to our state-discriminating framework when
the input ensemble is the set of orthogonal states $\{|x\rangle\}$. A pure measurement that corresponds
to the observable $X = \{|x\rangle\}$ must be made of contractions $M_y$ with
$M_y^{\dagger} M_y = P_y$, with singular value decomposition $M_y = X_{M_y} \Sigma(M_y) Y_{M_y}$, with
right unitary operators $Y_y \equiv Y_{M_y}$ giving $Y_y |x\rangle = \Pi_y |x\rangle$, namely giving the
same orthonormal basis on which
$\Sigma(M_y) = \mathrm{diag}[\sigma_1(M_y), \sigma_2(M_y), \ldots, \sigma_d(M_y)]$ is diagonal, apart
from a permutation $\Pi_y$ of the basis $\{|x\rangle\}$. This is equivalent to saying that the most
general form of the contraction $M_y$ is $M_y = W_y \sum_x \sqrt{p(y|x)}\, |x\rangle\langle x|$, with
$W_y$ unitary: in other words, there is a unitary $W_y$ such that
$[W_y^{\dagger} M_y, |x\rangle\langle x|] = 0$, $\forall x$. The measurement is complete (i.e. it scans
the whole spectrum $\sigma(X) = \{x\}$ of the observable $X$, with $|\sigma(X)| = d$) when
$r(M_y) = d$. The measurement is non-degenerate (namely each outcome $y$ corresponds unambiguously to a
unique most probable value $x$) if $\sigma_1(M_y) > \sigma_2(M_y)$, which means that $p(y|x)$ for each
$y$ has a non-degenerate maximum as a function of $x$. The optimal probability $p_{rev}(M_y)$ of
reversing the contraction $M_y$ is given by $\sigma_d^2(M_y)$ and can be conveniently bounded as
$p_{rev}(M_y) \le [\prod_i \sigma_i^2(M_y)]^{1/d}$. Upon rewriting the singular values in terms of the
conditional probabilities and after summing over all outcomes $y$ we get the bound for the average
reversion probability $\bar{p}_{rev} \le B(X:Y)$, where
$B(X:Y) = \sum_y [\prod_{x \in \sigma(X)} p(y|x)]^{1/|\sigma(X)|}$ is the Bhattacharyya overlap bound
derived in Ref. [55]. We see that $0 \le B(X:Y) \le 1$, with $B(X:Y) = 0$ when, for every outcome $y$,
$p(y|x)$ vanishes for some value of $x$, and $B(X:Y) = 1$ when $p(y|x)$ is independent of $x$ for every
$y$. Therefore, the measurement has more chance of being reverted (i.e. it makes "less disturbance") when
the conditional probability distribution is more "flat" as a function of $x$, namely when the
information on $x$ is smaller.
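
As a small numerical check of the bound $\bar{p}_{rev} \le B(X:Y)$, the sketch below takes the conditional probabilities as a table `cond[y][x]` $= p(y|x)$, uses $\sigma_d^2(M_y) = \min_x p(y|x)$ for each outcome, and compares the sum over outcomes with the Bhattacharyya overlap. The three example tables and the function names are assumptions chosen for illustration.

```python
import math

def average_reversal_probability(cond):
    """Sum over outcomes y of sigma_d^2(M_y) = min_x p(y|x),
    where cond[y][x] = p(y|x)."""
    return sum(min(row) for row in cond)

def bhattacharyya_bound(cond):
    """B(X:Y) = sum_y [prod_x p(y|x)]^(1/d), with d the number of values x."""
    d = len(cond[0])
    return sum(math.prod(row) ** (1.0 / d) for row in cond)

# Assumed examples; each column (fixed x) sums to one over the outcomes y.
sharp = [[0.9, 0.1], [0.1, 0.9]]       # informative measurement
flat = [[0.5, 0.5], [0.5, 0.5]]        # p(y|x) independent of x
perfect = [[1.0, 0.0], [0.0, 1.0]]     # von Neumann measurement of X

for cond in (sharp, flat, perfect):
    print(average_reversal_probability(cond), bhattacharyya_bound(cond))
```

In these examples the flat table saturates the bound at 1, the perfect measurement gives 0 for both quantities, and the sharp table lies in between.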

The repeated application of a complete non-degenerate measurement of an observable $X$ provides
another instructive example of the information-disturbance trade-off. In fact, we can apply the
measurement many times on the same quantum system prepared in the ensemble of orthogonal states
$\{|x\rangle\}$, compensating the measurement back-action with the conditional unitary transformation
$W_y^{\dagger}$. In this way we will make no disturbance on the quantum system (which will always remain
in its original state) and, at the same time, from the statistics of the outcomes we can also achieve
perfect discrimination in the limit of infinitely many repetitions. However, since a cascade made of more
repetitions will correspond to an overall conditional probability more and more sharply peaked around
the "right" value $x$, the contraction corresponding to the cascade will also have a smaller and smaller
chance of reversion, and in the limit of infinite repetitions it will approach a rank-one von Neumann
measurement. Here we see that in principle it is possible to extract perfect non-erasable information
even by using a knowingly reversible measurement, provided the measurement is performed infinitely many
times on the same quantum system. It is clear that the information retrieved from the measurement on the
input state can be perfect only when the input ensemble is $\{|x\rangle\}$; otherwise it will be lower
than the maximum value (given by the Holevo bound [?]), and, in particular, it is zero when the input
ensemble corresponds to the observable $Y$ "conjugated" to $X$, namely the input states
$\{|y_k\rangle, k = 1, \ldots, d\}$ are of the form
$|y_k\rangle = d^{-1/2} \sum_{l=0}^{d-1} e^{2\pi i k l/d} |x_l\rangle$, where the spectrum of $X$ has been
labeled with $x_l$.

2.6 What is disturbance? 

We cannot give a definition of disturbance that is good for all situations, since its definition must
be suited to the particular problem at hand. For example, a definition in terms of the fidelity between
input and output [?] can be suited to some quantum cryptanalysis; however, we cannot consider it as
a measure of the in-principle disturbance on the measured system, since we would have disturbance also
from a unitary transformation, which can be reversed at will on any unknown input state. As another
example, when we want to account for the possibility of reversing the measurement approximately by
a unitary transformation, a suitable definition of the disturbance $D(M)$ from a contraction $M$ should
capture how much the output $|\psi_M\rangle$ in Eq. (1) is unitarily uncorrelated with the input
$|\psi\rangle$, since we would say that there is no disturbance if $|\psi\rangle$ and $|\psi_M\rangle$
are connected by a fixed unitary transformation, say $V$, independently of $|\psi\rangle$. Then we would
define the "disturbance" as $D(M) = 1 - C(M)$, where $C(M)$ is the input-output unitary correlation of
$M$, defined as the fidelity between $|\psi_M\rangle$ and $V|\psi\rangle$ for unitary $V$, averaged over
all $|\psi\rangle$ [with the joint probability $p(M, \psi)$], and then maximized over $V$, namely
$C(M) = \max_V \overline{|\langle\psi|V^{\dagger}|\psi_M\rangle|^2}$. A straightforward calculation gives
$C(M) = [d(d+1)]^{-1} [\|M\|_1^2 + \|M\|_2^2]$. We can see that $C(M)$ approaches its maximum
$C(M) = 1$ for a contraction $M$ close to a unitary (all singular values approach 1), whereas it is
minimum, $C(M) = 2/[d(d+1)]$, for a rank-one $M$. Notice that here $D(M) = 1 - C(M)$ is a Schur-convex
function of the vector $[\sigma_i^2(M)]$ of squared singular values of $M$.
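
The two limiting values quoted above can be reproduced with a few lines of code. The sketch below assumes the closed form $C(M) = [\|M\|_1^2 + \|M\|_2^2]/[d(d+1)]$, evaluated directly from the list of singular values of $M$ (zeros included); this form is consistent with the limits 1 for unitary $M$ and $2/[d(d+1)]$ for a rank-one $M$ with unit singular value. The function names and the intermediate example are illustrative assumptions.

```python
def unitary_correlation(sigmas):
    """C(M) = (||M||_1^2 + ||M||_2^2) / (d (d + 1)), computed from the
    singular values of M (a list of length d, vanishing ones included)."""
    d = len(sigmas)
    trace_norm = sum(sigmas)                       # ||M||_1
    hs_norm_squared = sum(s * s for s in sigmas)   # ||M||_2^2
    return (trace_norm ** 2 + hs_norm_squared) / (d * (d + 1))

def disturbance(sigmas):
    """D(M) = 1 - C(M)."""
    return 1.0 - unitary_correlation(sigmas)

d = 3
c_unitary = unitary_correlation([1.0] * d)               # all singular values 1
c_rank_one = unitary_correlation([1.0] + [0.0] * (d - 1))
c_generic = unitary_correlation([0.9, 0.6, 0.3])         # assumed example
print(c_unitary, c_rank_one, c_generic)
```

For $d = 3$ this gives $C = 1$ in the unitary case and $C = 1/6$ in the rank-one case, with the generic contraction in between.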

The "disturbance" $D(M) = 1 - C(M)$ quantifies our inability to approximately revert $M$ by a unitary
transformation. More generally, if we want to define $D(M)$ in a way which is related to our
in-principle ability to reverse $M$, we must consider that reversion is generally achieved by another
measurement. Then, the definition of disturbance must satisfy the following requirements:

1. The disturbance $D(M)$ due to $M$ must be a function only of the probabilities of reversing its effect,
not of how the reversion is performed. Therefore, we must have $D(M) = D(UM)$ for all unitary
$U$, namely the disturbance is a function only of the POVM element $M^{\dagger} M$ of the measurement.

2. If we look for a definition of $D(M)$ which is a property of $M$ only, independently of the input
state, then in addition to requirement 1 we must also have $D(M) = D(MV)$ for all unitary
$V$. This means that the disturbance must be a function of the singular values of $M$ only, namely
$D(M) = f([\sigma_i(M)])$. Therefore, our definition of $D(M)$ should be of this form at least for the
input universal ensemble.

3. We expect that the disturbance will be minimum for unitary $M$, and maximum for $r(M) = 1$
(Gordon-Louisell measurement [24], e.g. von Neumann): since in general the definition of $D(M)$
should also depend on the input ensemble, these two extreme cases should hold at least for the case
of the input universal ensemble.

2.7 Majorization trade-offs. 

In the search for general trade-offs between "information" and "disturbance" for a quantum measurement 
at the single-outcome level we will try to accomplish the following aim. While satisfying the above 
requirements, we look for general inequalities which will guarantee the trade-off independently of 
the specific quantities that will be used for both "information" and "disturbance", to be suited to the 
particular problem at hand. Notice that the usual information is the sum of two contributions, 
of which the first one, $H(\mathcal{E})$, is independent of $M$, whereas the second, $-H(\mathcal{E}_M)$, is a Schur-convex function 
of the conditioned probabilities $a(\psi|M)$. Therefore, if we want our trade-off to be true also for the usual 
information, we should look for a majorization relation $\mathbf{a}_M \prec \mathbf{z}_M$ between the vector $\mathbf{a}_M = [a(\psi|M)]$ 
and a vector $\mathbf{z}_M = \mathbf{z}(\sigma_i(M), \mathcal{E})$ having components that depend on the singular values $\sigma_i(M)$ of $M$ 
along with quantities related to the ensemble $\mathcal{E}$, and such that for the input universal ensemble it will be a 
function of $\sigma_i(M)$ only [for majorization theory see Ref.]. This will guarantee the trade-off by 
just taking for $\mathbf{z}_M$ the same Schur-convex function $f(\mathbf{a}_M) = -H(\mathbf{a}_M)$ that we have in the information, namely 
$f(\mathbf{z}_M) = -H(\mathbf{z}_M)$. Moreover, the majorization relation will guarantee the trade-off for any other choice 
of Schur-convex function, depending on the problem, in which the "information" is a function of $\mathbf{a}_M$, 
and the "disturbance" is the same function of $\mathbf{z}_M$. Notice, however, that the power of the majorization 
approach is also its weakness. In fact, since a majorization relation will guarantee the trade-off for all 
Schur-convex functions, it may be possible that for a given function ($f = -H$ in our case) the trade-off 
holds more generally than for $\mathbf{a}_M \prec \mathbf{z}_M$. Finally, we want to emphasize that the convexity of 
the function $f$ is unrelated to the assertion that "the disturbance from a set of $M$ randomly chosen 
is always lower than their averaged disturbance", since in our case the definition of disturbance is given 
only for pure contractions, as we are concerned only with pure measurements. On the other hand, as we 
will see in the following, when we consider the complete measurement with all possible outcomes, we can 
easily average the trade-off over the outcomes with their probabilities of occurrence. 

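Looking for a majorization relation involving $\mathbf{a}_M$ is equivalent to looking for a majorization relation for 
the joint probabilities $a(M,\psi)$, since the two are related by a fixed normalization constant given by the 
overall probability $p_{\mathcal{E}}(M)$ of occurrence of $M$. It is easy to derive a weak majorization relation as follows 

$$a(M,\psi_j) = \sum_{i=1}^{d} a(\psi_j)\,|\langle i|Y_M|\psi_j\rangle|^2\,\sigma_i^2(M) = \sum_{i=1}^{d} W_{ji}\,\sigma_i^2(M), \qquad (8)$$

where $M = X_M \Sigma_M Y_M$ is the singular value decomposition of $M$, and $\{|i\rangle\}$ is an orthonormal basis on 
which $\Sigma_M$ has the canonical diagonal form. The rectangular matrix $W_{ji} = a(\psi_j)|\langle i|Y_M|\psi_j\rangle|^2$ is doubly 
sub-stochastic, since $\sum_i W_{ji} = a(\psi_j)\,\mathrm{Tr}[Y_M|\psi_j\rangle\langle\psi_j|Y_M^\dagger] = a(\psi_j)$, and $\sum_j W_{ji} = \langle i|Y_M \rho_{\mathcal{E}} Y_M^\dagger|i\rangle \le 1$. This 
means that the following weak majorization relation (symbol $\prec_w$) holds 

$$[a(M,\psi_j)] \prec_w [\sigma_i^2(M)]. \qquad (9)$$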

However, the weak majorization relation will guarantee trade-offs for a choice of Schur-convex function 
that is also increasing on its domain [again, this does not mean that the trade-off cannot hold for 
some particular Schur-convex function] . 
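The weak majorization relation can be verified numerically on a randomly drawn ensemble and contraction. The sketch below assumes the definitions used in the text, namely $W_{ji} = a(\psi_j)|\langle i|Y_M|\psi_j\rangle|^2$ and joint probabilities $a(M,\psi_j) = a(\psi_j)\|M|\psi_j\rangle\|^2$; the helper names and the random test data are hypothetical.

```python
import numpy as np

def weakly_majorized(x, y, tol=1e-12):
    """True if x is weakly majorized by y (partial sums of the sorted vectors)."""
    x = np.sort(np.asarray(x, dtype=float))[::-1]
    y = np.sort(np.asarray(y, dtype=float))[::-1]
    n = max(len(x), len(y))
    x = np.pad(x, (0, n - len(x)))     # zero-padding to a common length
    y = np.pad(y, (0, n - len(y)))
    return bool(np.all(np.cumsum(x) <= np.cumsum(y) + tol))

def random_ensemble(d, n_states, rng):
    """Random pure states (rows) with random a priori probabilities."""
    z = rng.normal(size=(n_states, d)) + 1j * rng.normal(size=(n_states, d))
    states = z / np.linalg.norm(z, axis=1, keepdims=True)
    priors = rng.dirichlet(np.ones(n_states))
    return states, priors

def random_contraction(d, rng):
    """Random matrix rescaled so that its largest singular value equals 1."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G / np.linalg.norm(G, 2)

rng = np.random.default_rng(1)
d, n_states = 3, 5
states, priors = random_ensemble(d, n_states, rng)
M = random_contraction(d, rng)

X, sigma, Y = np.linalg.svd(M)                               # M = X @ diag(sigma) @ Y
W = priors[:, None] * np.abs(states @ Y.T) ** 2              # W[j, i]
joint = priors * np.linalg.norm(states @ M.T, axis=1) ** 2   # a(M, psi_j)

print("row sums equal priors:", np.allclose(W.sum(axis=1), priors))
print("column sums at most 1:", bool(np.all(W.sum(axis=0) <= 1 + 1e-12)))
print("weak majorization holds:", weakly_majorized(joint, sigma**2))
```

The row and column sums confirm that $W$ is doubly sub-stochastic, which is the property that implies the weak majorization.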

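A majorization relation between the vector $[a(M,\psi_j)]$ and a vector containing the singular values of 
$M$ can be obtained by expanding the probability $a(M,\psi_j)$ as follows 

$$a(M,\psi_j) = a(\psi_j)\langle\psi_j|Y_M^\dagger\Sigma_M^2 Y_M|\psi_j\rangle = \sum_i a(\psi_j)|\langle\psi_j|Y_M^\dagger|i\rangle|^2\sigma_i^2(M) = \sum_i S_{ji}\lambda_i\sigma_i^2(M), \qquad (10)$$

where 

$$\lambda_i = \langle i|Y_M\rho_{\mathcal{E}}Y_M^\dagger|i\rangle, \qquad S_{ji} = a(\psi_j)|\langle\psi_j|Y_M^\dagger|i\rangle|^2\lambda_i^{-1}. \qquad (11)$$

Notice that $\lambda_i = \sum_j a(\psi_j)|\langle i|Y_M|\psi_j\rangle|^2$, and $\lambda_i = 0$ if and only if $|\langle\psi_j|Y_M^\dagger|i\rangle|^2 = 0$, $\forall j$, and the sum in Eq. 
(10) is extended only to those terms for which $\lambda_i > 0$, say for $i = 1, \ldots, r \le r(M)$. It follows that the 
$|\mathcal{E}| \times r$ matrix $S$ has the following row and column sums 

$$\sum_i S_{ji} = a(\psi_j)\langle\psi_j|Y_M^\dagger\zeta^{-1}Y_M|\psi_j\rangle \equiv s_j, \qquad \sum_j S_{ji} = 1, \qquad (12)$$

where $\zeta = \sum_i \lambda_i|i\rangle\langle i|$. Notice that generally $\zeta \ne Y_M\rho_{\mathcal{E}}Y_M^\dagger$, and we have $\zeta = Y_M\rho_{\mathcal{E}}Y_M^\dagger$ when $\rho_{\mathcal{E}}$ is diagonal with $M^\dagger M$, 
namely when $[\rho_{\mathcal{E}}, M^\dagger M] = 0$, in which case we are guaranteed that $s_j \le 1$, $\forall j$, whereas in general $s_j$ 
can be greater than unity. We will call the ensemble $\mathcal{E}$ parallel to $M$ when $\rho_{\mathcal{E}}$ commutes with $M^\dagger M$, and 
quasi-parallel to $M$ when $s_j \le 1$, $\forall j$. Ensembles that are parallel to any $M$ are obviously the maximally 
chaotic ones, for which $\rho_{\mathcal{E}} = d^{-1}I$. For ensembles quasi-parallel to $M$ the $|\mathcal{E}| \times r$ matrix $S$ in Eq. (11) 
can be augmented to a $(|\mathcal{E}| + r) \times (|\mathcal{E}| + r)$ doubly stochastic matrix as follows 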



$$\begin{pmatrix} S & \mathrm{diag}\{1 - s_j\} \\ 0 & S^{T} \end{pmatrix}. \qquad (13)$$



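By padding the vectors $[a(M,\psi_j)]$ and $[\lambda_i\sigma_i^2(M)]$ with $r$ and $|\mathcal{E}|$ additional zeros, respectively, Eqs. (10) 
and (13) guarantee the following majorization relation 

$$[a(M,\psi_j)] \prec [\lambda_i\sigma_i^2(M)], \qquad (14)$$

and upon normalizing both vectors we have 

$$\mathbf{a}_M \prec \mathbf{z}_M, \qquad (15)$$

with 

$$\mathbf{z}_M = \left[\frac{\lambda_i\sigma_i^2(M)}{\sum_l \lambda_l\sigma_l^2(M)}\right]. \qquad (16)$$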
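As a numerical illustration of the majorization between the a posteriori probabilities and the vector $\mathbf{z}_M$, the sketch below assumes $\lambda_i = \langle i|Y_M\rho_{\mathcal{E}}Y_M^\dagger|i\rangle$, row sums $s_j = a(\psi_j)\sum_i |\langle i|Y_M|\psi_j\rangle|^2/\lambda_i$, and $\mathbf{z}_M$ given by the normalized products $\lambda_i\sigma_i^2(M)$. The example ensemble (four qubit states with $\rho_{\mathcal{E}} = I/2$) and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def shannon(p):
    """Shannon entropy in bits, ignoring zero entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-15]
    return float(-np.sum(p * np.log2(p)))

def majorized(x, y, tol=1e-9):
    """True if x is majorized by y after zero-padding to a common length."""
    x = np.sort(np.asarray(x, dtype=float))[::-1]
    y = np.sort(np.asarray(y, dtype=float))[::-1]
    n = max(len(x), len(y))
    x = np.pad(x, (0, n - len(x)))
    y = np.pad(y, (0, n - len(y)))
    partial_ok = np.all(np.cumsum(x) <= np.cumsum(y) + tol)
    return bool(partial_ok and abs(x.sum() - y.sum()) <= tol)

def analyse(M, states, priors):
    """Return (a_M, z_M, s) for contraction M and ensemble {priors[j], states[j]}."""
    _, sigma, Y = np.linalg.svd(M)                # M = X @ diag(sigma) @ Y
    overlaps = np.abs(states @ Y.T) ** 2          # overlaps[j, i] = |<i|Y|psi_j>|^2
    lam = priors @ overlaps                       # lambda_i, assumed > 0 here
    s = priors * (overlaps / lam).sum(axis=1)     # row sums s_j of S
    joint = priors * (overlaps @ sigma**2)        # joint probabilities a(M, psi_j)
    weights = lam * sigma**2
    return joint / joint.sum(), weights / weights.sum(), s

# Hypothetical example: four qubit states with rho = I/2, hence parallel to any M.
h = 1 / np.sqrt(2)
states = np.array([[1, 0], [0, 1], [h, h], [h, -h]], dtype=complex)
priors = np.full(4, 0.25)
theta = 0.3
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
M = np.diag([1.0, 0.5]) @ rotation

a_M, z_M, s = analyse(M, states, priors)
print("s_j:", s)
print("a_M majorized by z_M:", majorized(a_M, z_M))
print("-H(a_M), -H(z_M):", -shannon(a_M), -shannon(z_M))
```

Since $-H$ is Schur-convex, the majorization implies $-H(\mathbf{a}_M) \le -H(\mathbf{z}_M)$, which the last printed line lets one inspect directly.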



For ensembles $\mathcal{E}$ that are not quasi-parallel to $M$ we can always build a squashed ensemble $\tilde{\mathcal{E}}$ that is 
quasi-parallel to $M$ by replicating the state $|\psi_j\rangle$ corresponding to $s_j > 1$ in sufficiently many identical 
copies $|\psi_j^{(\beta)}\rangle = |\psi_j\rangle$ distributed with probabilities $a(\psi_j^{(\beta)}) = q_\beta^{(j)} a(\psi_j)$, with $\sum_\beta q_\beta^{(j)} = 1$, such that 
$s_j \max_\beta\{q_\beta^{(j)}\} \le 1$. 
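The squashing construction can be sketched in a few lines. The example below assumes the row sums $s_j = a(\psi_j)\sum_i |\langle i|Y_M|\psi_j\rangle|^2/\lambda_i$ and, as one possible choice among many, splits each state with $s_j > 1$ into equally weighted copies; the uniform weights, the example ensemble, and the function names are assumptions made for illustration only.

```python
import numpy as np

def row_sums(M, states, priors):
    """Row sums s_j of the matrix S (assumes every lambda_i is strictly positive)."""
    _, _, Y = np.linalg.svd(M)                    # M = X @ diag(sigma) @ Y
    overlaps = np.abs(states @ Y.T) ** 2          # overlaps[j, i] = |<i|Y|psi_j>|^2
    lam = priors @ overlaps                       # lambda_i
    return priors * (overlaps / lam).sum(axis=1)

def squash(states, priors, s):
    """Replicate each state with s_j > 1 into ceil(s_j) equally weighted copies."""
    new_states, new_priors = [], []
    for psi, a, s_j in zip(states, priors, s):
        copies = max(1, int(np.ceil(s_j - 1e-12)))
        for _ in range(copies):
            new_states.append(psi)
            new_priors.append(a / copies)         # uniform weights q = 1 / copies
    return np.array(new_states), np.array(new_priors)

# Hypothetical ensemble that is not quasi-parallel to M = diag(1, 0.5).
h = 1 / np.sqrt(2)
states = np.array([[1.0, 0.0], [h, h]])
priors = np.array([0.9, 0.1])
M = np.diag([1.0, 0.5])

s = row_sums(M, states, priors)
squashed_states, squashed_priors = squash(states, priors, s)
s_squashed = row_sums(M, squashed_states, squashed_priors)
print("original s_j:", s)
print("squashed s_j:", s_squashed)
```

Replicating a state leaves the density operator of the ensemble unchanged, so each copy simply inherits the fraction of $s_j$ given by its weight.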



2.8 Information disturbance trade-offs 

From Eq. (15) it follows that for ensembles quasi-parallel to $M$ we have $-H(\mathbf{a}_M) \le -H(\mathbf{z}_M)$, and for 
the information on which-state retrieved from the occurrence of $M$ we have 

$$\Delta I_{\mathcal{E}}(M) \le H(\mathcal{E}) - H(\mathbf{z}_M). \qquad (17)$$

If the ensemble is not quasi-parallel to $M$, by considering any squashed ensemble $\tilde{\mathcal{E}}$ we obtain 

$$\Delta I_{\mathcal{E}}(M) \le H(\tilde{\mathcal{E}}) - H(\mathbf{z}_M) - \sum_j [a(\psi_j) - p(\psi_j|M)]\,H(\mathbf{q}^{(j)}), \qquad (18)$$

but, unfortunately, the last quantity in Eq. (18) has no definite sign. For this reason, in the following we 
will focus attention only on ensembles that are quasi-parallel to $M$. 

When considering a complete pure measurement $\mathcal{M} = [M_1, M_2, \ldots, M_n]$ with $\sum_i M_i^\dagger M_i = I$ we can 
average both sides of Eq. (17) over outcomes $i$ with the probability of occurrence $p_{\mathcal{E}}(M_i)$, and obtain 

$$\Delta I_{\mathcal{E}}(\mathcal{M}) \le H(\mathcal{E}) - \langle H(\mathbf{z}_{M_i})\rangle, \qquad (19)$$

where $\langle\ldots\rangle$ denotes the averaging over outcomes $i$. The quantity $-H(\mathbf{z}_M)$ can be regarded as a kind of 
"disturbance" due to $M$. Notice that 

$$-\log r(M) \le -H(\mathbf{z}_M) \le 0. \qquad (20)$$

The disturbance is minimum when $\sigma_i^2(M) \propto \lambda_i^{-1}$, and maximum for rank-one $M$ (Gordon-Louisell 
measurements) or when there is only one right-vector $Y_M^\dagger|i\rangle$ of $M$ in the range of $\rho_{\mathcal{E}}$. Notice that for a general 
ensemble the disturbance is not minimum for unitary $M$: this is a phenomenon due to the occurrence 
of negative informations analyzed previously, e.g. a measurement reverting a previous one undoes its 
disturbance, namely it makes "less disturbance" than a unitary transformation. In particular, when the 
ensemble is orthogonally split by M and fjl^ is itself orthogonal, then the minimum disturbance will be 
exactly equal to the information gain — in Eq. from a successfully reverted measurement. For 
orthogonal ensembles (generally not split) we have in average over outcomes 

$$\Delta I_{\mathcal{E}}(\mathcal{M}) \le S(\rho_{\mathcal{E}}) - \langle H(\mathbf{z}_M)\rangle \le \chi(\mathcal{E}), \qquad (21)$$

where $S(\rho) = -\mathrm{Tr}[\rho\log\rho]$ denotes the von Neumann entropy, and $\chi(\mathcal{E}) = S(\rho_{\mathcal{E}}) - \sum_j a_j S(\rho_j)$ is the 
Holevo bound for the ensemble with density operator $\rho_{\mathcal{E}} = \sum_j a_j\rho_j$ for a priori probabilities and states $a_j$ 
and $\rho_j$, respectively. Eq. (21) gives a bound for the information retrieved from the single outcome that is 
tighter than the Holevo bound [in our case the a priori states $\rho_j = |\psi_j\rangle\langle\psi_j|$ are pure, and $\chi(\mathcal{E}) = S(\rho_{\mathcal{E}})$]. The 
information-disturbance trade-off (21) asserts that we can make less disturbance at the price of retrieving 
less information than the available one. Also notice that in the present case of orthogonal input ensemble 
a measurement $\mathcal{M}$ made of random unitary transformations will give minimum disturbance and zero 
information. 

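We want to focus now on the simplest case in which the ensemble $\mathcal{E}$ is parallel to $M$. Here we have 

$$-H(\mathbf{z}_M) \equiv -S((\rho_{\mathcal{E}})_M), \qquad (22)$$

namely our disturbance is equal to the opposite of the von Neumann entropy of the "reduced" density 
operator $(\rho_{\mathcal{E}})_M$ 

$$(\rho_{\mathcal{E}})_M = \frac{M\rho_{\mathcal{E}}M^\dagger}{\mathrm{Tr}[M\rho_{\mathcal{E}}M^\dagger]}. \qquad (23)$$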
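For a parallel ensemble the identity between $H(\mathbf{z}_M)$ and the von Neumann entropy of the reduced operator can be checked directly. The sketch below assumes $\mathbf{z}_M$ is the normalized vector of products $\lambda_i\sigma_i^2(M)$, builds $\rho_{\mathcal{E}}$ so that it commutes with $M^\dagger M$, and also evaluates the local state of a purification after $M \otimes I$ is applied. The specific singular values, eigenvalues, and variable names are hypothetical.

```python
import numpy as np

def shannon(p):
    """Shannon entropy in bits, ignoring zero entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-15]
    return float(-np.sum(p * np.log2(p)))

def von_neumann(rho):
    """Von Neumann entropy in bits of a Hermitian, unit-trace matrix."""
    return shannon(np.linalg.eigvalsh(rho))

def random_unitary(d, rng):
    """Unitary factor of the QR decomposition of a complex Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return Q

rng = np.random.default_rng(2)
d = 3
X, Y = random_unitary(d, rng), random_unitary(d, rng)
sigma = np.array([1.0, 0.6, 0.3])            # singular values of the contraction
lam = np.array([0.5, 0.3, 0.2])              # eigenvalues of rho in the basis Y^dagger|i>

M = X @ np.diag(sigma) @ Y
rho = Y.conj().T @ np.diag(lam) @ Y          # commutes with M^dagger M by construction

weights = lam * sigma**2
z_M = weights / weights.sum()

out = M @ rho @ M.conj().T
rho_M = out / np.trace(out).real             # reduced operator after outcome M

# Purification as a d x d coefficient matrix Psi with Psi Psi^dagger = rho;
# applying M on the first system maps Psi to M Psi.
Psi = Y.conj().T @ np.diag(np.sqrt(lam))
Psi_M = M @ Psi
Psi_M = Psi_M / np.linalg.norm(Psi_M)
reduced = Psi_M @ Psi_M.conj().T             # partial trace over the second system

print(shannon(z_M), von_neumann(rho_M), von_neumann(reduced))
```

The three printed entropies coincide, which is the numerical counterpart of reading the disturbance as the entropy of the local state of the transformed purification.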

From Eqs. (22) and (23) we also see that for ensembles parallel to $M$ our "disturbance" is also exactly 
equal to the "reduction of entanglement" that $M$ would produce locally on any entangled state $|\Psi\rangle$ that 
is a purification of $\rho_{\mathcal{E}}$, namely, for 

$$|\Psi_M\rangle = \frac{(M \otimes I)|\Psi\rangle}{\|(M \otimes I)|\Psi\rangle\|}, \qquad \mathrm{Tr}_2[|\Psi\rangle\langle\Psi|] = \rho_{\mathcal{E}}, \qquad (24)$$

we will have $-H(\mathbf{z}_M) = -S(\mathrm{Tr}_2[|\Psi_M\rangle\langle\Psi_M|])$. Notice that in general, $M$ can also probabilistically 
increase the entanglement of $|\Psi\rangle$: this situation corresponds to the occurrence of negative informations 
mentioned above, with disturbance less than that from a unitary transformation. In the special case in 
which the ensemble is also maximally chaotic (e.g. for the universal ensemble), our disturbance will be 
given by 

$$-H(\mathbf{z}_M) = -H\left(\left[\frac{\sigma_i^2(M)}{\sum_l \sigma_l^2(M)}\right]\right), \qquad (25)$$

and the less disturbing is $M$, the "more flat" are its singular values, with the largest mutual information 
being achievable with rank-one measurements. This situation is depicted in Fig. 1. From Eq. (25) we see 
that our disturbance "interpolates" the definition of disturbance $D(M) = -\log r(M)$ proposed by Ozawa 
[51] for the trade-off $I(X|\rho) \le S(\rho) - \log r(M_x)$ for the "information gain" $I(X|\rho) = S(\rho) - \sum_x p(x|\rho)S(\rho_x)$ 
[57] from a pure quantum measurement made of contractions $M_x$ all with the same rank $r(M_x)$, with $\rho$ 
the input state, $p(x|\rho) = \mathrm{Tr}[M_x^\dagger M_x\rho]$, and $\rho_x = M_x\rho M_x^\dagger/\mathrm{Tr}[M_x^\dagger M_x\rho]$ the output state. 

We conclude this section by noticing that the present definition of disturbance explains the information- 
disturbance trade-off in quantum teleportation[|5^ , between the Alice's information on the transmitted 
state and the disturbance at Bob on the received state, the trade-off being tuned by switching on-off the 
entanglement of the shared resource. Indeed, it is easy to see that in any teleportation scheme in which 
Alice performs a generic Bell measurement [^8|, the disturbance is just the opposite of the entanglement 
of the state $|\Psi\rangle$ of the shared resource. 

Figure 1: More and less disturbing measurement contraction $M$ (for input universal ensemble): the less 
disturbing $M$ (on the left) has "more flat" singular values. 



3 Concluding remarks 

In this paper we have considered an ideal in-principle quantum measurement at the single-outcome level, 
which is then described by a single contraction. We have analyzed the possibility of measurements that 
are knowingly reversible, showing that measurement reversion necessarily erases the information from 
the reverted measurement. This also clarifies that it is possible in principle to undo the effect of a 
measurement, however, at the expense of losing some previously retrieved information. Information- 
disturbance trade-offs have been presented, where the "disturbance" depends on the probabilities of 
reverting the measurement. Two majorization relations have been given: the weak majorization (9), 
which holds for any ensemble, and the majorization (15), which holds for ensembles "quasi-parallel" to the 
measurement contraction $M$. These relations represent trade-offs that are independent of the particular 
analytical form of information and disturbance. When considering the customary mutual information, 
the majorization (15) leads us to consider the quantity $-H(\mathbf{z}_M)$ as a "disturbance", with the vector 
$\mathbf{z}_M$ depending on the singular values of $M$ and on the input ensemble $\mathcal{E}$ as given in Eq. (16). Such a 
quantity satisfies all the requirements that we gave for a general disturbance, and behaves as expected in 
all known cases. Even though the information-disturbance trade-off (17) has been proved for ensembles 
quasi-parallel to $M$ (since it has been derived from the majorization relation (15)), Eq. (17) can have a 
more general validity, and an alternative derivation will be the subject of a forthcoming work. 